-
This article introduces a general processing framework for effectively utilizing waveform data stored on modern cloud platforms. The focus is on hybrid processing schemes in which a local system drives processing. We show that downloading files and doing all processing locally is problematic even when the local system is a high-performance computing (HPC) cluster. Benchmark tests with parallel processing show that this approach always creates a bottleneck as the volume of data being handled increases with more processes pulling data. We find that a hybrid model, in which processing on the cloud reduces the volume of data transferred from the cloud servers to the local system, can dramatically improve processing time. Tests implemented with the Massively Parallel Analysis System for Seismology (MsPASS) utilizing the Amazon Web Services (AWS) Lambda service yield throughput comparable to processing day files on a local HPC file system. Given the ongoing migration of seismology data to cloud storage, our results show that doing some or all processing on the cloud will be essential for any processing involving large volumes of data.
Free, publicly-accessible full text available August 20, 2026
-
Ruppert, Natalia A; Jadamec, Margarete A; Freymueller, Jeffrey T (Ed.)
Free, publicly-accessible full text available November 27, 2025
-
This article introduces a new framework for seismic data processing and management that we call the Massive Parallel Analysis System for Seismologists (MsPASS). The framework was designed to enable new scientific frontiers in seismology by providing a means to more effectively utilize massively parallel computers to handle the increasingly large data volume available today. MsPASS leverages several existing technologies: (1) scalable parallel processing frameworks, (2) a NoSQL database management system, and (3) containers. The system leans heavily on the widely used ObsPy toolkit. It automates many database operations and provides a mechanism to automatically save the processing history for reproducibility. The synthesis of these components provides the flexibility to adapt to a wide range of data processing workflows. We demonstrate the system with a basic data processing workflow applied to USArray data. Through extensive documentation and examples, we aim to make this system a sustainable, open‐source framework for the community.